Mining of Web Server Logs using Extended Apriori Algorithm
Abstract
Association rule mining is one of the most significant techniques in data mining. It is used to discover relationships hidden in large transaction datasets, such as frequent patterns and associations. One of the most popular algorithms in this category is the Apriori algorithm, which finds frequent itemsets using an iterative approach. Its major limitation is that, on large databases, it requires many passes over the data while searching for frequent itemsets, which increases the scanning time. To reduce this time, this paper proposes an improved version of the Apriori algorithm, called Extended Apriori, which decreases the number of transactions in the database, thereby reducing the size of the database and the time needed to scan it. The extended algorithm is then used to mine the web server logs of an educational web site in order to discover the pages most frequently visited by users, and its performance is compared graphically with that of the existing Apriori algorithm. Finally, the paper outlines some future research directions in the area of web server log mining.
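The abstract does not say how Extended Apriori chooses which transactions to drop. As a rough illustration only, the sketch below uses a common transaction-reduction heuristic: after pass k, any transaction that has k or fewer items, or that contains no frequent k-itemset, cannot contain a frequent (k+1)-itemset and is skipped in later passes. The function name `apriori_with_reduction`, the `min_support` parameter (an absolute count), and the sample page paths are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def apriori_with_reduction(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Assumption (not confirmed by the paper): after pass k, transactions that
    cannot contain any frequent (k+1)-itemset are removed from later scans.
    """
    active = [frozenset(t) for t in transactions]

    # Pass 1: count individual items (here, individual pages).
    counts = {}
    for t in active:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 1
    while frequent:
        # Transaction reduction: keep only transactions that are long enough
        # and contain at least one frequent k-itemset.
        active = [t for t in active
                  if len(t) > k and any(s <= t for s in frequent)]

        # Candidate generation: join frequent k-itemsets, then prune any
        # candidate that has an infrequent k-subset (Apriori property).
        prev = set(frequent)
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k + 1 and all(
                    frozenset(sub) in prev for sub in combinations(union, k)
                ):
                    candidates.add(union)

        # Scan only the reduced transaction list to count candidates.
        counts = {c: 0 for c in candidates}
        for t in active:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1

    return result


# Hypothetical user sessions, each a set of visited pages.
sessions = [
    {"/home", "/courses", "/exams"},
    {"/home", "/courses"},
    {"/home", "/exams"},
    {"/contact"},
    {"/home", "/courses", "/exams"},
]
frequent_pages = apriori_with_reduction(sessions, min_support=3)
```

The reduction leaves support counts unchanged, because a dropped transaction could not have contained any surviving candidate; it only shortens each later scan. Whether this matches the reduction rule used in the paper is an open assumption.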
Similar resources
Implementation of Web Usage Mining Using APRIORI and FP Growth Algorithms
Web Usage Mining is the application of data mining techniques to discover interesting usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Usage data captures the identity or origin of Web users along w...
Frequent Pattern Mining of Web Log Files Working Principles
Frequent pattern mining plays a major role in the mining of web log files. Web usage mining is one of the web mining processes; it applies mining techniques to web server logs to extract the behavior of users. Web usage mining consists of three important phases: data preprocessing, pattern discovery and pattern analysis. In the data preprocessing phase the unwanted data are remove...
Efficient Frequent Pattern Mining on Web Log Data
Mining frequent patterns from web log data can help to optimise the structure of a web site and improve the performance of web servers. Web users can also benefit from these frequent patterns. Many efforts have been made to mine frequent patterns efficiently. The candidate-generation-and-test approach (Apriori and its variants) and the pattern-growth approach (FP-growth and its variants) are the two re...
Effective web log mining and online navigational pattern prediction
The web has become the world's largest repository of knowledge. Web usage mining is the process of discovering knowledge from the interactions generated by the user in the form of access logs, cookies, and user sessions data. Web Mining consists of three different categories, namely Web Content Mining, Web Structure Mining, and Web Usage Mining (is the process of discovering knowledge from the ...
A new outlier detection approach to discover low hit web pages using sequential frequent pattern mining to improve website’s design
The Internet offers a huge volume of data to users and grows rapidly every day. The web server creates log files that record details about the page, the IP address of the user, the browser and operating system used, and the time/date stamp of each request, and this data is mined to extract useful information using web usage mining. The primary objective of this paper is to find the low hit pages...
Publication date: 2013